This worksheet has a draft implementation of the time projected network, and does some distance based calculations with it in order to cross-check the assumptions of the earliest-temporal path algorithm. It draws heavily on Jim Moody’s draft paper Static Representations of Dynamic Networks
library(tsna)
## Loading required package: network
## network: Classes for Relational Data
## Version 1.13.0 created on 2015-08-31.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
## Mark S. Handcock, University of California -- Los Angeles
## David R. Hunter, Penn State University
## Martina Morris, University of Washington
## Skye Bender-deMoll, University of Washington
## For citation information, type citation("network").
## Type help("network-package") to get started.
##
## Loading required package: networkDynamic
##
## networkDynamic: version 0.9, created on 2015-11-24
## Copyright (c) 2015, Carter T. Butts, University of California -- Irvine
## Ayn Leslie-Cook, University of Washington
## Pavel N. Krivitsky, University of Wollongong
## Skye Bender-deMoll, University of Washington
## with contributions from
## Zack Almquist, University of California -- Irvine
## David R. Hunter, Penn State University
## Li Wang
## Kirk Li, University of Washington
## Steven M. Goodreau, University of Washington
## Jeffrey Horner
## Martina Morris, University of Washington
## Based on "statnet" project software (statnet.org).
## For license and citation information see statnet.org/attribution
## or type citation("networkDynamic").
##
## Loading required package: ergm
## Loading required package: statnet.common
##
## ergm: version 3.6-14181.1-2015.11.12-22.35.00, created on 2015-11-12
## Copyright (c) 2015, Mark S. Handcock, University of California -- Los Angeles
## David R. Hunter, Penn State University
## Carter T. Butts, University of California -- Irvine
## Steven M. Goodreau, University of Washington
## Pavel N. Krivitsky, University of Wollongong
## Martina Morris, University of Washington
## with contributions from
## Li Wang
## Kirk Li, University of Washington
## Skye Bender-deMoll, University of Washington
## Based on "statnet" project software (statnet.org).
## For license and citation information see statnet.org/attribution
## or type citation("ergm").
##
## NOTE: If you use custom ERGM terms based on 'ergm.userterms'
## version prior to 3.1, you will need to perform a one-time update
## of the package boilerplate files (the files that you did not write
## or modify) from 'ergm.userterms' 3.1 or later. See
## help('eut-upgrade') for instructions.
Define a quick/crude implementation of the time projected graph with arbitrary discrete intervals/times. Does not implement the condensed/compressed version Jim suggests.
# create a crude version of discritized time projected graph from nD object
timeProjectedNetwork<-function(nd,start = NULL, end = NULL, time.increment = NULL,
onsets = NULL, termini = NULL, ...){
# TODO: initilize vertex pid to for case when of vertex dynamics change sizes of nets
# slice up the dynamic network into a list of static networks
nets<-get.networks(nd,start = start, end = end, time.increment = time.increment,
onsets = onsets, termini = termini, ...)
# init a new network of appropriate size
projected<-network.initialize(length(nets)*network.size(nd))
# TODO: map old vertex labels for debugging
network.vertex.names(projected)<-unlist(network.vertex.names(nd),rep,network.size(nd))
# loop over edges to create within-slice edges
for(s in seq_along(nets)){
el<-as.matrix.network.edgelist(nets[[s]])
# multiply the vertex indices by s to give them for the next slice
el<-el+(network.size(nd)*(s-1))
add.edges(projected,tail=el[,1],head=el[,2])
# handle un-directed edges by adding arc in each direction
if(!is.directed(nd)){
add.edges(projected,tail=el[,2],head=el[,1])
}
}
# loop again to create between-slice vertex self ties
# TODO: omit ties as indicated by vertex activity
for(s in 1:(length(nets)-1)){
tails<-1:network.size(nd)+network.size(nd)*(s-1)
heads<-1:network.size(nd)+network.size(nd)*s
add.edges(projected,tail=tails, head=heads )
}
return(projected)
}
This is pretty crude, doesn’t label stuff well, or tag the various edge types so they can be rendered differently.
Build the “tuning fork” network
m6<-network.initialize(6,directed=FALSE)
network.vertex.names(m6)<-LETTERS[1:6]
add.edges.active(m6,tail=1,head=2,onset=2,terminus=5)
add.edges.active(m6,tail=2,head=3,onset=3,terminus=7)
add.edges.active(m6,tail=3,head=4,onset=8,terminus=10)
add.edges.active(m6,tail=2,head=5,onset=0,terminus=1)
add.edges.active(m6,tail=5,head=6,onset=3,terminus=5)
plot(m6,displaylabels = TRUE,edge.label = get.edge.activity(m6))
Discretely re-sample it into a static network
m6projected<-timeProjectedNetwork(m6,start=0,end=10,time.increment=1)
plot with pre-defined grid coordinates for example
coords<-cbind(unlist(lapply(1:10,rep,6)),rep(1:6,10))
plot(m6projected,coord=coords,displaylabels=TRUE,jitter=FALSE)
The “Full multi-slice Representation” plot in Figure 2 of draft paper has a lot of implied vertex (in)-activity
plot with normal spatial layout algorithm
plot(m6projected,mode='kamadakawai',displaylabels = TRUE,vertex.cex=0.5)
Increase the temporal “resolution” by decreasing the bin size.
m6projectedFine<-timeProjectedNetwork(m6,start=0,end=10,time.increment=0.1)
coords<-cbind(unlist(lapply(1:100,rep,6)),rep(1:6,100))
plot(m6projectedFine,coord=coords,jitter=FALSE,
vertex.cex=0.2, usearrows=FALSE)
Try it out on the contact simulation example with varying times
First, lets do it with time steps of 100
data(moodyContactSim)
contactProjected100<-timeProjectedNetwork(moodyContactSim,start=0,end=1000,time.increment=100)
plot(contactProjected100,mode='kamadakawai',vertex.cex=0.5)
As Jim suggest, can also do it (more efficiently?) by only taking a network slices (bins) at the times that edges in the network actually change.
chChaChanges<-get.change.times(moodyContactSim)
contactProjectedChanges<-timeProjectedNetwork(moodyContactSim,onsets=chChaChanges,termini=chChaChanges)
plot(contactProjectedChanges,mode='kamadakawai',vertex.cex=0.5)
This example starts as network list, so have to make into an nD object first, which is kind of silly
data(newcomb,fig.width=8)
## Warning in data(newcomb, fig.width = 8): data set '8' not found
newcombNd<-networkDynamic(network.list=newcomb)
## Neither start or onsets specified, assuming start=0
## Onsets and termini not specified, assuming each network in network.list should have a discrete spell of length 1
## Argument base.net not specified, using first element of network.list instead
## Created net.obs.period to describe network
## Network observation period info:
## Number of observation spells: 1
## Maximal time range observed: 0 until 14
## Temporal mode: discrete
## Time unit: step
## Suggested time increment: 1
newcombProjected<-timeProjectedNetwork(newcombNd)
plot(newcombProjected,mode='kamadakawai',vertex.cex=0.5)
This is pretty well connected, so the individual time slices hold together well. Problem is that I think this is the raw rank data for Newcomb, so everyone always has the the same degree, and you can’t see the structure. Need to threshold first.
Use carter’s rgl 3d implementation to reproduce Moody’s 3d layouts (have to actually run this to see it in R’s 3D GL viewer!)
library(sna)
## sna: Tools for Social Network Analysis
## Version 2.3-2 created on 2014-01-13.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
## For citation information, type citation("sna").
## Type help(package="sna") to get started.
##
##
## Attaching package: 'sna'
##
## The following object is masked from 'package:network':
##
## %c%
library(rglwidget)
library(rgl)
open3d(antialias=1,windowRect=c(0,0,500,500))
## glX
## 1
gplot3d(contactProjectedChanges,edge.col='#55555555')
rglscene<-scene3d()
rgl.close()
rglwidget(rglscene)
(If RGL installed, will bring up a new 3d plot window. zoom in with mouse wheel, click and drag to rotate)
Should be pretty easy to use ndtv to generate smooth per-slice networks for force time oriented layout with each step as a layer.
First, load up data and construct the projected network
library(ndtv)
## Loading required package: animation
##
## ndtv: version 0.8, created on 2015-10-14
## Copyright (c) 2015, Skye Bender-deMoll, University of Washington
## with contributions from
## Martina Morris, University of Washington
## Based on "statnet" project software (statnet.org).
## For license and citation information see statnet.org/attribution
## or type citation("ndtv").
data(short.stergm.sim)
shortStergProj<-timeProjectedNetwork(short.stergm.sim)
plot(shortStergProj,mode='kamadakawai')
Next, do a 2d layout on the original (not time-projected) network as if we were going to make a movie using the same bin intervals. Extract those coordinates, assemble into an x,y,z matrix, and pass it into the gplot3d command.
# compute coords for each slice
compute.animation(short.stergm.sim)
## No slice.par found, using
## slice parameters:
## start:0
## end:25
## interval:1
## aggregate.dur:1
## rule:latest
## Calculating layout for network slice from time 0 to 1
## Calculating layout for network slice from time 1 to 2
## Calculating layout for network slice from time 2 to 3
## Calculating layout for network slice from time 3 to 4
## Calculating layout for network slice from time 4 to 5
## Calculating layout for network slice from time 5 to 6
## Calculating layout for network slice from time 6 to 7
## Calculating layout for network slice from time 7 to 8
## Calculating layout for network slice from time 8 to 9
## Calculating layout for network slice from time 9 to 10
## Calculating layout for network slice from time 10 to 11
## Calculating layout for network slice from time 11 to 12
## Calculating layout for network slice from time 12 to 13
## Calculating layout for network slice from time 13 to 14
## Calculating layout for network slice from time 14 to 15
## Calculating layout for network slice from time 15 to 16
## Calculating layout for network slice from time 16 to 17
## Calculating layout for network slice from time 17 to 18
## Calculating layout for network slice from time 18 to 19
## Calculating layout for network slice from time 19 to 20
## Calculating layout for network slice from time 20 to 21
## Calculating layout for network slice from time 21 to 22
## Calculating layout for network slice from time 22 to 23
## Calculating layout for network slice from time 23 to 24
## Calculating layout for network slice from time 24 to 25
## Calculating layout for network slice from time 25 to 26
# extract the coords it generated for each slice
xy<-lapply(0:24,function(t){
cbind(x=get.vertex.attribute.active(short.stergm.sim,'animation.x',at=t),
y=get.vertex.attribute.active(short.stergm.sim,'animation.y',at=t))
})
# smoosh into 3D coord array
xy<-do.call(rbind,xy)
xyz<-cbind(xy,z=unlist(lapply(0:24,rep,network.size(short.stergm.sim))))
head(xyz)
## x y z
## [1,] -0.7350049 -2.397288 0
## [2,] 0.7517537 -2.682371 0
## [3,] 1.0661296 1.070898 0
## [4,] -0.2447175 2.636865 0
## [5,] 0.6671756 2.002633 0
## [6,] 1.9857179 1.201906 0
# make it 3d
gplot3d(shortStergProj,coord=xyz,jitter=FALSE)
So it kind of works, but definitely needs to be able to visually distinguish the within-slice and between slice edges. Not very easy to trace out paths of individual vertices, but can see groups, as long as interact enough to keep the 3d plot moving so my brain can percive the 3rd dimension.
Contrast with proximity timeline view of the same network. This does a 1D spatial layout at each timestep on the y-axis, and temporal ordering along the x-axis.
proximity.timeline(short.stergm.sim,mode='sammon',default.dist=10,labels.at=17)
## collapsing slice networks ...
## computing vertex positions using 1D sammon layout ...
## computing positions for slice 1
## computing positions for slice 2
## computing positions for slice 3
## computing positions for slice 4
## computing positions for slice 5
## computing positions for slice 6
## computing positions for slice 7
## computing positions for slice 8
## computing positions for slice 9
## computing positions for slice 10
## computing positions for slice 11
## computing positions for slice 12
## computing positions for slice 13
## computing positions for slice 14
## computing positions for slice 15
## computing positions for slice 16
## computing positions for slice 17
## computing positions for slice 18
## computing positions for slice 19
## computing positions for slice 20
## computing positions for slice 21
## computing positions for slice 22
## computing positions for slice 23
## computing positions for slice 24
## computing positions for slice 25
## rendering splines for each vertex ...
I think this may be a bit easier to trace out individual vertex paths and see the integration of isolates and formation / seperation of groups. But the actuall times when vertices are tied are not directly plotted (only implied via proximity of lines)
In general, I find that all of these plotting techniques are not very generalizeable across classes of networks. A technique may work beautifully on one example, but fail misrably on second because some un-realized constraint is violated (i.e. planarity, maximum density, conectedness, etc.)
Enough pretty pictures …
Hypothesis: if we evaluate at every change step, geodesic distances on the time-projected graph will match earliest times in dynamic graph. If we use discrete bin intervals they will be different. But as bin intervals get smaller, it will approach the exact solution.
Compute the shortest path geodesics from the vertices in the first ‘wave’ to the vertices in the last ‘wave’. In order to make sure the paths are exactly comparable, we use the version where we generated a new ‘wave’ at each of the 35 points where anything changed in the network.
# this is the overall matrix
geodistMoodyContactChanges<-geodist(contactProjectedChanges)$gdist
# extract the distances between first and last wave
gdistChanges<-geodistMoodyContactChanges[1:16,545:560]
gdistChanges
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 34 36 Inf 36 Inf 36 Inf 36 35 Inf 35 35 Inf
## [2,] 36 34 36 36 36 36 36 36 37 Inf 37 37 35
## [3,] Inf Inf 34 Inf 36 Inf 36 36 Inf Inf 37 Inf 35
## [4,] 38 36 36 34 36 38 36 36 39 35 37 39 35
## [5,] Inf Inf Inf Inf 34 Inf 36 36 Inf Inf 37 Inf 35
## [6,] Inf Inf Inf 36 Inf 34 Inf Inf Inf Inf Inf Inf Inf
## [7,] Inf Inf Inf Inf Inf Inf 34 Inf Inf 35 Inf Inf 35
## [8,] Inf Inf Inf Inf 36 Inf 36 34 Inf Inf 35 Inf 35
## [9,] 35 Inf Inf Inf Inf Inf Inf Inf 34 Inf Inf Inf Inf
## [10,] 39 37 37 35 37 39 35 37 40 34 38 40 36
## [11,] 35 37 Inf 37 Inf 37 Inf 35 36 Inf 34 36 Inf
## [12,] 35 Inf Inf Inf Inf Inf Inf Inf 36 Inf Inf 34 Inf
## [13,] 37 35 35 35 35 37 35 35 38 Inf 36 38 34
## [14,] Inf Inf Inf 35 Inf Inf Inf Inf Inf Inf Inf Inf Inf
## [15,] Inf 35 Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf
## [16,] 35 35 Inf 35 Inf 35 Inf Inf 36 Inf Inf 36 Inf
## [,14] [,15] [,16]
## [1,] Inf 37 35
## [2,] Inf 35 35
## [3,] Inf Inf Inf
## [4,] 35 37 35
## [5,] Inf Inf Inf
## [6,] Inf Inf 35
## [7,] Inf Inf Inf
## [8,] Inf Inf Inf
## [9,] Inf Inf Inf
## [10,] 36 38 36
## [11,] Inf 38 36
## [12,] Inf Inf Inf
## [13,] 36 36 36
## [14,] 34 Inf 36
## [15,] Inf 34 Inf
## [16,] Inf 36 34
(TODO: It is actually pretty wasteful to calculate the All Pairs Shortest Path on the projected network, since we actually only care about the distances from the first wave to the last, and it is evaluating all the in between ones as well. Be nice if I could tell the sna geodist function to only calculate paths between vertices 1:16,545:560 since we will subset to this anyway. )
Also compute the temporal distance of earliest arriving temporal path. Evaluate the tPath distances from each vertex at t=0 and aggregate into a matrix.
tdist<-t(sapply(1:16,function(v){
tPath(moodyContactSim,v=v)$tdist
}))
tdist
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 0 543 Inf 708 Inf 661 Inf 709 634 Inf 174 581 Inf
## [2,] 543 0 594 708 672 661 729 679 634 Inf 709 581 454
## [3,] Inf Inf 0 Inf 672 Inf 729 679 Inf Inf 709 Inf 594
## [4,] 543 454 594 0 672 661 184 679 634 0 709 581 413
## [5,] Inf Inf Inf Inf 0 Inf 729 679 Inf Inf 709 Inf 672
## [6,] Inf Inf Inf 708 Inf 0 Inf Inf Inf Inf Inf Inf Inf
## [7,] Inf Inf Inf Inf Inf Inf 0 Inf Inf 184 Inf Inf 729
## [8,] Inf Inf Inf Inf 679 Inf 729 0 Inf Inf 709 Inf 679
## [9,] 634 Inf Inf Inf Inf Inf Inf Inf 0 Inf Inf Inf Inf
## [10,] 543 454 594 0 672 661 184 679 634 0 709 581 413
## [11,] 174 543 Inf 708 Inf 661 Inf 709 634 Inf 0 581 Inf
## [12,] 581 Inf Inf Inf Inf Inf Inf Inf 634 Inf Inf 0 Inf
## [13,] 543 454 594 413 672 661 729 679 634 Inf 709 581 0
## [14,] Inf Inf Inf 625 Inf Inf Inf Inf Inf Inf Inf Inf Inf
## [15,] Inf 669 Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf
## [16,] 543 535 Inf 708 Inf 661 Inf Inf 634 Inf Inf 581 Inf
## [,14] [,15] [,16]
## [1,] Inf 669 543
## [2,] Inf 669 535
## [3,] Inf Inf Inf
## [4,] 625 669 535
## [5,] Inf Inf Inf
## [6,] Inf Inf 661
## [7,] Inf Inf Inf
## [8,] Inf Inf Inf
## [9,] Inf Inf Inf
## [10,] 625 669 535
## [11,] Inf 669 543
## [12,] Inf Inf Inf
## [13,] 625 669 535
## [14,] 0 Inf 708
## [15,] Inf 0 Inf
## [16,] Inf 669 0
First level of check: Do they find the same set of reachable vertices? Check that all the Inf match up between the matrices?
all(is.infinite(gdistChanges)==is.infinite(tdist))
## [1] TRUE
Great! However, it is a bit hard to directly compare distances with these versions, because the interval of time between waves in the ‘changes’ version are not the same at each time step. Also the paths selected to get there may not be the same..
Plot all the the distances/times for each cell in the matrix against each other
plot(as.numeric(gdistChanges),as.numeric(tdist))
As expected, they don’t line up exactly. Note also that the x-axis begins at 34, because the shortest path distance from a vertex at t=0 to itself at t=1000 is 34 identity-arc hops. We can subtract 34 from the distancs to make them eaiser to compare.
What about if we compare the number of steps in the lengths of the paths they find
tdistGHops<-t(sapply(1:16,function(v){
tPath(moodyContactSim,v=v)$gsteps
}))
plot(as.numeric(gdistChanges)-34,as.numeric(tdistGHops),main='tdist hops vs changes hops')
Which paths are different?
tdistGHops==gdistChanges-34
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [4,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [5,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [6,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [7,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [8,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [9,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [10,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [11,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [12,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [13,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [14,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [15,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [,14] [,15] [,16]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE
## [4,] TRUE TRUE FALSE
## [5,] TRUE TRUE TRUE
## [6,] TRUE TRUE TRUE
## [7,] TRUE TRUE TRUE
## [8,] TRUE TRUE TRUE
## [9,] TRUE TRUE TRUE
## [10,] TRUE TRUE FALSE
## [11,] TRUE TRUE TRUE
## [12,] TRUE TRUE TRUE
## [13,] TRUE TRUE TRUE
## [14,] TRUE TRUE TRUE
## [15,] TRUE TRUE TRUE
## [16,] TRUE TRUE TRUE
Looks like 4 to 16, and 10 to 16
tdistGHops[4,16]
## [1] 3
gdistChanges[4,16]-34
## [1] 1
tdistGHops[10,16]
## [1] 4
gdistChanges[10,16]-34
## [1] 2
Plot our favorite example network to see what is going on
plot(moodyContactSim,
displaylabels=TRUE,
label.cex=0.8,
label.pos=5,
vertex.col='white',
vertex.cex=3,
edge.label=sapply(get.edge.activity(moodyContactSim),function(e){
paste('(',e[,1],'-',e[,2],')',sep='')
}),
edge.label.col='blue',
edge.label.cex=0.7,
main='moodyContactSim network'
)
Both of theses cases are where the shortest temporal path does not equal the earliest temporal path. The path with the fewest number of graph hops arrives later. For example, 4-16 is the shortest path (arriving at t=748), but the earliest is 4-13-2-16 (arriving at t=575).
Construct a version where we aggregate the network in 10 waves of 100 time units. Pretty sure that this is going to cause some distortion, so this will be the straw-man example.
# this is the overall matrix
geodistMoodyContact100<-geodist(contactProjected100)$gdist
# extract the distances between first and last wave
gdist100<-geodistMoodyContact100[1:16,145:160]
And a version where we aggregate in 20 waves of 50 time units
# this is the overall matrix
contactProjected50<-timeProjectedNetwork(moodyContactSim,time.increment=50)
geodistMoodyContact50<-geodist(contactProjected50)$gdist
# extract the distances between first and last wave
gdist50<-geodistMoodyContact50[1:16,305:320]
plot(as.numeric(gdist100),as.numeric(gdist50),main='100step aggregation vs 50step')
In this version, the times align (they differ by constant multiple because the we have not weighted the identity edges in any way to account for the differences in temporal length)
Lets do it so we construct a time projected network with a 1000 waves, one for every single discrete time point (this will take a while)
# this is the overall matrix
contactProjected1<-timeProjectedNetwork(moodyContactSim,time.increment=1) # <-- this object is 13Mb
geodistMoodyContact1<-geodist(contactProjected1)$gdist
Can’t run version above because it gives “Error: cannot allocate vector of size 1.9Gb” when it tries to allocate the matrix for APSP geodist. Since we can’t make matrices that big, lets try every 5 time steps as an approximation.
# this is the overall matrix
contactProjected5<-timeProjectedNetwork(moodyContactSim,time.increment=5) # <-- this object is 2.6 Mb
geodistMoodyContact5<-geodist(contactProjected5)$gdist
# extract the distances between first and last wave
gdist5<-geodistMoodyContact5[1:16,3185:3200]
compare time increment 5 to 100
plot(as.numeric(gdist100),as.numeric(gdist5),main='100step aggregation vs 5step')
compare 5 to ‘changes’ version
plot(as.numeric(gdist5),as.numeric(gdistChanges),main='5step aggregation vs exact changes')
The 5-time unit aggregation matches the “exact” (changes) version.
plot(as.numeric(gdist100),as.numeric(gdistChanges), main='100step aggregation vs exact changes')
But, not surprisingly, the 100 time unit aggregation, seems to under-estimate some distances, presumably because the aggregation makes some paths seem feasible that are not. This makes sense, as the duration of most edges is ~30 in this simple example.
The 100 unit aggregation doesn’t just get the paths lengths wrong, it also over-estimates the forward reachable set.
sum(!is.infinite(gdist100)) # how many non-Inf distances are there?
## [1] 147
sum(!is.infinite(gdist5)) # how many non-Inf distances are there?
## [1] 121
So,
As always when slicing up a continuous network into discrete samples, we have the choice between sampling ‘at’ a time point or aggregating over the entire window between samples (the default in the examples above, with the exception of changes). The former will nearly always underestimate ‘true’ connectivity because it can miss ties that occur between sampling periods, and the later will overestimate (as Jim mentions) because it will collapse the relative ordering of ties occurring within the bin, making them appear concurrent when they may not be.
Making the network discrete is generally going to bias the temporal information one way or another. So the time projected is probably more useful if you have data that is only collected in a few survey waves. However, if the focus is transmission phenomena, and we collect data in waves (w/o asking for explicit timing on ties within waves), and the survey times are far enough apart that there is more than one tie change per wave, we have already lost some of the ordering information needed for transmission anyway and maybe shouldn’t be using these metrics.
I think that the time projected network is extremely useful for pedagogical purpose and thinking through various measures.
But I’d also argue that as we increase the accuracy by letting the time interval between slices approach the limit of zero, and the redundent information is removed via compression, the representation is becoming closer and closer to the continuous time spell-based representation used in networkDynamic. And in general the spell representation gives an efficient way to compute things since we can build algorithms that can use the spell information to operate as if we have constructed the time projected network without allocating giant matrices.